Multi-band summary correlogram-based pitch detection for noisy speech
Abstract
A multi-band summary correlogram (MBSC)-based pitch detection algorithm (PDA) is proposed. The PDA performs pitch estimation and voiced/unvoiced (V/UV) detection via novel signal processing schemes that are designed to enhance the MBSC’s peaks at the most likely pitch period. These peak-enhancement schemes include comb-filter channel-weighting to yield each individual subband’s summary correlogram (SC) stream, and stream-reliability-weighting to combine these SCs into a single MBSC. V/UV detection is performed by applying a constant threshold to the maximum peak of the enhanced MBSC. Narrowband noisy speech sampled at 8 kHz is generated from the Keele corpus (development set) and the CSTR (Centre for Speech Technology Research) corpus (evaluation set). Both 4-kHz fullband speech and G.712-filtered telephone speech are simulated. When evaluated solely on pitch estimation accuracy, assuming voicing detection is perfect, the proposed algorithm has the lowest gross pitch error for noisy speech in the evaluation set among the algorithms evaluated (RAPT, YIN, etc.). The proposed PDA also achieves the lowest average pitch detection error when both pitch estimation and voicing detection errors are taken into account. © 2013 Elsevier B.V. All rights reserved.
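As a rough illustration of the last step described in the abstract (a constant threshold applied to the maximum peak of a periodicity function), here is a minimal single-band sketch in Python. It is not the MBSC algorithm: it omits the subband decomposition, comb-filter channel-weighting, and stream-reliability-weighting, and substitutes a plain normalized autocorrelation for the enhanced MBSC. The function names, the 50 to 400 Hz search range, and the 0.5 threshold are illustrative assumptions, not values taken from the paper.

```python
import numpy as np


def normalized_autocorrelation(frame):
    """Biased autocorrelation of a frame, scaled so that lag 0 equals 1."""
    frame = np.asarray(frame, dtype=float)
    frame = frame - np.mean(frame)
    energy = float(np.dot(frame, frame))
    if energy == 0.0:
        # Silent frame: no periodicity information at any lag.
        return np.zeros(len(frame))
    full = np.correlate(frame, frame, mode="full")
    # Keep non-negative lags only (lag 0 sits at index len(frame) - 1).
    return full[len(frame) - 1:] / energy


def detect_pitch(frame, fs=8000, f0_min=50.0, f0_max=400.0, threshold=0.5):
    """Return (f0_hz, peak); f0_hz is 0.0 when the frame is judged unvoiced.

    Assumed stand-in for the paper's scheme: the peak of a single-band
    normalized autocorrelation replaces the peak of the enhanced MBSC.
    """
    acf = normalized_autocorrelation(frame)
    lag_min = max(1, int(fs / f0_max))
    lag_max = min(int(fs / f0_min), len(acf) - 1)
    search = acf[lag_min:lag_max + 1]
    best_lag = int(np.argmax(search)) + lag_min
    peak = float(acf[best_lag])
    if peak < threshold:
        # Constant-threshold V/UV decision on the maximum peak.
        return 0.0, peak
    return fs / best_lag, peak


if __name__ == "__main__":
    fs = 8000
    t = np.arange(480) / fs  # one 60 ms frame at 8 kHz
    voiced = np.sin(2 * np.pi * 100 * t) + 0.5 * np.sin(2 * np.pi * 200 * t)
    print(detect_pitch(voiced, fs=fs))
```

In this sketch the search range bounds the candidate pitch periods, and the threshold trades missed voiced frames against false voicing detections; a real system would tune both on a development set, as the abstract does with the Keele corpus.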
Similar resources
Robust Speech and Bird Song Processing using Multi-band Correlograms and Sparse Representations
Abstract of the dissertation "Robust Speech and Bird Song Processing using Multi-band Correlograms and Sparse Representations" by Lee Ngee Tan, Doctor of Philosophy in Electrical Engineering, University of California, Los Angeles, 2014. Professor Abeer Alwan, Chair. This dissertation focuses on algorithms for robust speech and bird song processing. Many applications perform well under ideal signal conditions,...
Monaural Voiced Speech Segregation Based on Pitch and Comb Filter
The correlogram is an important mid-level representation for periodic sounds that is widely used in sound source separation and pitch detection. However, it is very time-consuming to compute. In this paper, we present a novel scheme for monaural voiced speech separation without computing correlograms. The noisy speech is first decomposed into time-frequency units. Pitch contour of the target speech ...
Pitch estimation of noisy speech signals using empirical mode decomposition
This paper presents a pitch estimation method for noisy speech signals using empirical mode decomposition (EMD). The normalized autocorrelation function (NACF) of the noisy speech signal is decomposed into a finite set of band-limited signals, termed intrinsic mode functions (IMFs), using EMD. The periodicity of one IMF is supposed to be equal to the accurate pitch period. A conventional autocor...
Pitch Tracking Based on Statistical Anticipation
An effective multi-pitch tracking algorithm for noisy speech is critical for auditory processing. However, the performance of existing algorithms is not satisfactory. We have developed a robust algorithm for multi-pitch tracking of noisy speech based on statistical anticipation. By combining an improved channel and peak selection method, a new integration method for extracting periodicity infor...
An Automatic Pitch Detection Method Based on Multi-feature for Mandarin Speech
There are many traditional pitch detection methods, but most of them cannot perform well across different speakers, applications and environmental conditions. For this reason, a pitch detection method based on multiple features is proposed. First, the speech signal is pre-filtered. Second, the pre-filtered speech signal is segmented into syllables. Finally, the pitch period is obtained by wa...
Journal: Speech Communication
Volume: 55
Publication date: 2013